Skip to content

Add in-process MAA token caching to PopKeyAttestor#5887

Open
Robbie-Microsoft wants to merge 11 commits intomainfrom
rginsburg/maa_caching
Open

Add in-process MAA token caching to PopKeyAttestor#5887
Robbie-Microsoft wants to merge 11 commits intomainfrom
rginsburg/maa_caching

Conversation

@Robbie-Microsoft
Copy link
Copy Markdown
Contributor

@Robbie-Microsoft Robbie-Microsoft commented Mar 25, 2026

Summary

MAA (Microsoft Azure Attestation) tokens are acquired during mTLS binding certificate minting, requiring a native DLL call + network round-trip to the MAA endpoint. Since MAA tokens carry their
own JWT expiry and have a longer TTL than binding certs, they can safely be reused across cert re-mints.

This PR adds an in-memory cache for MAA tokens inside PopKeyAttestor to avoid redundant native DLL and network calls.

Key point: when the cert cache is warm, attestation is skipped entirely by the cert-reuse path. The MAA cache activates when the cert cache is cold (first boot, cert expiry, or explicit
eviction) — it ensures that if multiple cert re-mints happen before the MAA token expires, only the first one calls the attestation service.

Changes

PopKeyAttestor.cs

  • Adds a static ConcurrentDictionary<string, AttestationToken> cache keyed by "{endpoint}|{keyId}"
    • endpoint — different MAA regional endpoints have different trust domains
    • keyIdCngKey.KeyName; ensures key rotation does not return a stale token for the new key
    • clientId intentionally omitted — MAA attests the CNG key, not the identity; SAMI and UAMI sharing the same key at the same endpoint correctly share the cached token
  • Key handle validation (null, IsInvalid, SafeNCryptKeyHandle cast) always runs before the cache check
  • Tokens within 5 minutes of expiry are evicted (matches AccessTokenExpirationBuffer); freshness check guards against DateTimeOffset.MinValue - buffer overflow
  • Per-key single-flight via KeyedSemaphorePool prevents thundering-herd redundant attestation calls
  • Cache hit/miss/store logged via ILoggerAdapter threaded from request context through AttestationTokenProvider delegate
  • Static constructor registers ResetCacheForTest with ApplicationBase.RegisterResetCallback so ApplicationBase.ResetStateForTest() automatically clears the MAA cache

ApplicationBase.cs

  • Adds RegisterResetCallback(Action) + ConcurrentBag<Action> s_resetCallbacks
  • ResetStateForTest() invokes all registered callbacks after its own resets

AcquireTokenCommonParameters / AcquireTokenForManagedIdentityParameters

  • AttestationTokenProvider delegate updated to include ILoggerAdapter and string keyId

ImdsV2ManagedIdentitySource.cs

  • Extracts rsaCng.Key.KeyName and forwards it as keyId to the attestation delegate

ImdsV2Tests.cs

  • [TestCleanup] calls PopKeyAttestor.ResetCacheForTest() (belt-and-suspenders alongside the ApplicationBase callback)
  • Adds 6 new tests:
    • MaaTokenCache_Hit_DoesNotCallAttestationProviderAgain
    • MaaTokenCache_ExpiredToken_CallsAttestationProviderAgain
    • MaaTokenCache_DifferentEndpoints_CachedSeparately
    • MaaTokenCache_DifferentKeyIds_CachedSeparately — two different CNG keys at the same endpoint get separate cache entries (key rotation safety)
    • MaaTokenCache_ConcurrentCacheMiss_SingleFlightCallsProviderOnce
    • MaaTokenCache_CertCacheMiss_MaaTokenCacheStillFresh_DoesNotCallAttestationProviderAgain

Test results

81/81 ImdsV2Tests passing on .NET 4.8 and .NET 8, no regressions.

No public API changes

Caching is entirely internal to PopKeyAttestor. WithAttestationSupport() is unchanged.

Cache MAA (Microsoft Azure Attestation) JWT tokens in-process to
avoid redundant native DLL calls and network round-trips to the MAA
endpoint on every mTLS cert mint.

Changes:
- PopKeyAttestor: add ConcurrentDictionary<string, AttestationToken>
  keyed by '{maaEndpoint}|{clientId}'. MAA cache is checked before any
  native or network call. On MAA cache hit the key handle is not
  required. MAA tokens within 5 min of expiry (matching MSAL's
  AccessTokenExpirationBuffer) are considered stale and trigger a fresh
  MAA attestation call. Key validation is deferred to cache-miss path.
  Add ResetCacheForTest() and overridable s_expirationBuffer for tests.
- ImdsV2Tests: call ResetCacheForTest() in TestCleanup. Add three tests:
  - MaaTokenCache_Hit_DoesNotCallAttestationProviderAgain
  - MaaTokenCache_ExpiredToken_CallsAttestationProviderAgain
  - MaaTokenCache_DifferentEndpoints_CachedSeparately

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Robbie-Microsoft Robbie-Microsoft requested a review from a team as a code owner March 25, 2026 16:13
Copilot AI review requested due to automatic review settings March 25, 2026 16:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds in-process caching of Microsoft Azure Attestation (MAA) tokens in PopKeyAttestor to avoid repeated native DLL calls and network round-trips during mTLS certificate minting, and introduces unit tests to validate cache behavior.

Changes:

  • Added a static in-memory token cache keyed by {maaEndpoint}|{clientId} with an expiration buffer.
  • Cached-token fast path now returns without requiring a key handle; cache-miss path performs key validation and attestation.
  • Added/updated ImdsV2Tests to reset the cache and validate cache hit/expiry/separate endpoints behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
tests/Microsoft.Identity.Test.Unit/ManagedIdentityTests/ImdsV2Tests.cs Adds cleanup to reset the new cache and introduces tests for cache hit/expiry/isolation by endpoint.
src/client/Microsoft.Identity.Client.KeyAttestation/PopKeyAttestor.cs Implements token caching, cache lookup before native/network call, and test hooks to reset cache/change expiration buffer.

- Add per-key SemaphoreSlim (s_keyLocks) to prevent concurrent in-flight
  attestation calls for the same cache key (thundering herd protection)
- Double-check cache after acquiring lock so waiters skip the network call
- Evict stale entries in TryGetCachedToken via TryRemove to prevent
  unbounded memory growth
- Replace null with string.Empty in cache-hit AttestationResult message
  for consistency and null-safety
- ResetCacheForTest() now also clears s_keyLocks
- Add MaaTokenCache_ConcurrentCacheMiss_SingleFlightCallsProviderOnce test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@gladjohn gladjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Robbie, looks good overall. Are we able to use any of the cache helper pattern from the cert cache here?

…he key comment, new test

- Move keyHandle validation before cache check (Bogdan)
- Remove redundant ThrowIfCancellationRequested (Bogdan)
- Add ILoggerAdapter param; log cache hit/miss/store (Bogdan + gladjohn)
- Register PopKeyAttestor.ResetCacheForTest with ApplicationBase.ResetStateForTest via callback (gladjohn)
- Add cache key design comment explaining why Key.ID is not included (Bogdan + gladjohn)
- Update AttestationTokenProvider delegate type to thread logger from request context
- Update MaaTokenCache_DifferentEndpoints test: always pass valid handles (no null on cache hit)
- Add MaaTokenCache_CertCacheMiss_MaaTokenCacheStillFresh test (gladjohn)
- Update XML doc comment on PopKeyAttestor class (gladjohn)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 30, 2026 19:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

@Robbie-Microsoft
Copy link
Copy Markdown
Contributor Author

Are we able to use any of the cache helper pattern from the cert cache here?

The cert cache uses a two-tier (in-memory + persistent) design with a full interface hierarchy (ICertificateCache, IMtlsCertificateCache, etc.) because certs need to survive process restarts and are encrypted at rest. The MAA token cache is intentionally simpler — in-memory only, with a ConcurrentDictionary + per-key semaphore for single-flight. Reusing the cert cache pattern would bring in persistence, encryption, and MtlsBindingCache orchestration that aren't warranted for a short-lived JWT. If the requirements change (e.g. persist across reboots), we can adopt that pattern then.

Robbie-Microsoft and others added 2 commits March 30, 2026 15:40
…erflow, test fixes

- Add ArgumentNullException guard to ApplicationBase.RegisterResetCallback
- Replace hand-rolled ConcurrentDictionary<string,SemaphoreSlim> with KeyedSemaphorePool
- Fix TryGetCachedToken: flip comparison to avoid DateTimeOffset.MinValue - buffer overflow
- MaaTokenCache_Hit: add WithForceRefresh + mocks so second acquire exercises cert/MAA path
- MaaTokenCache_Concurrent: release gate for all tasks.Length so regression fails fast instead of hanging
- MaaTokenCache_CertCacheMiss: add TokenSource.IdentityProvider assertion to prove path was exercised

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Robbie-Microsoft and others added 2 commits March 30, 2026 16:11
…l is stateless)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Include the CNG key name (CngKey.KeyName) as a third component of the
MAA token cache key, changing the format from '{endpoint}|{clientId}'
to '{endpoint}|{clientId}|{keyId}'.

This prevents a stale MAA token for one CNG key from being returned
for a different key that shares the same endpoint+clientId (e.g. after
key rotation).

Changes:
- PopKeyAttestor: add optional keyId param to AttestCredentialGuardAsync,
  update s_testAttestationProvider delegate type, update BuildCacheKey,
  rewrite cache-key design comment
- AcquireTokenCommonParameters / AcquireTokenForManagedIdentityParameters:
  update AttestationTokenProvider delegate type
- ImdsV2ManagedIdentitySource: extract rsaCng.Key.KeyName and forward
  as keyId to the attestation delegate
- ManagedIdentityAttestationExtensions: lambda accepts and forwards keyId
- Test helpers (ManagedIdentityTestExtensions, TestAttestationProviders):
  update all delegate signatures
- ImdsV2Tests: update existing lambdas and direct AttestCredentialGuardAsync
  calls; add MaaTokenCache_DifferentKeyIds_CachedSeparately test

Addresses review feedback from bgavrilMS in PR #5887.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 21:45
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Robbie-Microsoft and others added 2 commits April 7, 2026 17:55
Cache key changes from '{endpoint}|{clientId}|{keyId}'
to '{endpoint}|{keyId}'.

MAA attests the CNG key itself; clientId is forwarded to the attestation
service as metadata but does not change which token is valid for a given
key. Different managed identities (SAMI vs UAMI) sharing the same CNG key
at the same endpoint should share the cached token.

Updated the cache key design comment to explain:
- why clientId is intentionally excluded
- why a single static field (gladjohn's suggestion) is insufficient once
  keyId is in play (multiple endpoint|keyId combinations possible)

keyId and endpoint remain in the key per bgavrilMS and gladjohn feedback.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove reference to 'clientId' in the test comment; the cache key
is now '{endpoint}|{keyId}' and clientId is no longer a cache dimension.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 7, 2026 21:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

…emaphore comment, test keyId fixes

- Switch Info(string) to Info(Func<string>) in PopKeyAttestor to avoid eager string allocation
- Add NormalizeEndpoint() helper that trims trailing slash before composing cache key
- Add comment explaining KeyedSemaphorePool stays bounded (named keys only; ephemeral share one slot)
- Fix MaaTokenCache_DifferentEndpoints_CachedSeparately: use const string keyIds (rsa.Key.KeyName is null for ephemeral keys)
- Fix MaaTokenCache_ConcurrentCacheMiss_SingleFlightCallsProviderOnce: add explicit keyId argument

82/82 ImdsV2Tests pass on .NET 8.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants